1 Aim

Analysis of KDM5 copy number and response to KDM5 did show a correlation. Here this relationship is visualized and investigated in a larger collection of samples. Additionally, the drug response is correlated with expression of KDM5.

Only the KDM5A and KDM5B paralogues are investigated. KDM5C and KDM5D are not included because they are located on the X and Y chromosomes respectively. Sex chromosomes are routinely excluded from CNV estimation because of their complex behavior.

2 Available data

2.1 Drug sensitivity

Sensitivity to KDM5 inhibitor CPI-455 was determined for the tumoroids of the NETG1G2 cohort. Also for some cell lines this information may be available but needs to be retrieved first.

2.2 Copy number

Copy number data is available for the samples of the NETG1G2 cohort, the samples from DiDomenico 2020 and several cell lines (BON1, QGP1, NT3)

Copy number analysis was not performed for all samples of DiDomenico 2020. While it is possible to add the information for the missing samples this is not done at the moment because the published CNV information was validated experimentally. Adding the missing samples may leave the impression that they were confirmed experimentally too.

# These functions are taken from 044_Cell_line_hypoxia
# They are used to extract the information about loci of interest from a saved conumee object
# Some arguments were static and now are function

# genes of interest
detail_regions = GRanges(seqnames = c("chr11", "chr12", "chr1", "chr1", "chr19", "chr9"), 
                         ranges = IRanges(start = c(64570982, 389295, 202696526, 44115829, 4969125, 6720863), 
                                          end = c(64578766, 498620, 202778598, 44171186, 5153606, 7175648)), 
                         strand = c("*"),
                         name = c("MEN1", "KDM5A", "KDM5B", "KDM4A", "KDM4B", "KDM4C"))

# The information about the cromosome arms is used later to generate heatmaps
hg19_arms = read_delim("/Bioinformatics/projects/shared_data/hg19_UCSC_cytoBand.txt.gz", 
                       delim = "\t", 
                       col_names = c("Chromosome", "Start", "End", "Band", "Giemsa"),
                       show_col_types = F)

hg19_arms = hg19_arms %>% 
  mutate(Arm = str_sub(Band, 1, 1)) %>%
  group_by(Chromosome, Arm) %>% 
  summarize(Start = min(Start),
            End = max(End), 
            Length = End - Start, 
            .groups = "drop")

createAnno = function(data_type){
  data("exclude_regions")
  # The array type is set to overlap because I'm using both EPIC and 450K data
  CNV.create_anno(array_type = data_type, 
                  exclude_regions = exclude_regions,
                  detail_regions = detail_regions)
}

extractCNABins = function(CNA_data){
  # for each samples the bins are extracted and merged into one data frame
  raw_bins = bind_rows(lapply(names(CNA_data), function(x){
    bin_df = CNV.write(CNA_data[[x]], what = "bins")
    colnames(bin_df)[5] = "CNV"
    bin_df$Sample_Name = x
    return(bin_df)
  }))
  # The bins are annotated with the chromosome arms 
  # Most likely there is a very elegant solution using GRanges and Rtracklayer 
  raw_bins = bind_rows(lapply(unique(raw_bins$Chromosome), function(x){
    current_chr = raw_bins %>% 
      filter(Chromosome == x)
    current_p = hg19_arms %>% 
      filter(Chromosome == x & Arm == "p")
    current_chr %>% 
      mutate(Arm = ifelse(End < current_p$End, "p", "q"))
  }))
}

getCNVvalues = function(gene_name,
                        CNV_bins,
                        anno){
  gene_region = detail_regions[detail_regions$name == gene_name]
  gene_bins = findOverlaps(gene_region, anno@bins, select = "all")
  gene_bins = names(anno@bins[subjectHits(gene_bins)])
  
  CNV_bins %>% 
    filter(Feature %in% gene_bins) %>% 
    group_by(Sample_Name) %>% 
    mutate(Start = min(Start),
           End = max(End)) %>% 
    ungroup() %>% 
    pivot_wider(names_from = Feature,
                values_from = CNV,
                names_glue = "bin_{Feature}") %>% 
    mutate(average_CNV = rowMeans(across(starts_with("bin"), ~ .x))) %>% 
    dplyr::select(1:5,
                  average_CNV,
                  starts_with("bin"))
}

2.3 Gene expression

All expression data is taken from 046_RNASeq_count_data_collection. For all RNA-Seq data sets the raw counts and normalized data is available.

Because data is compared across diverse data sets the TMM normalization is chosen. This normalizes each sample by the library size making the counts more comparable across experiments. However, differences in library composition may systematically skew the TMM values between experiments. Therefore, comparison between experiments should be done carefully.

Wherever possible data mapped to GRCh38 is used.

3 NETG1G2

For the samples in this cohort drug response was measured on the tumoroids. DNA methylation data is only available for the primary tumors and the drug sensitivity information is transferred from the tumoroids. In turn tumoroids are assigned the same CNV as the matched tumors. Expression data is available for some tumors and tumoroids.

In addition, the DAXX / ATRX / MEN1 mutation status was investigated. All available samples are DAXX / ATRX mutated. For MEN1 no muation data is available for these samples. Therefore, no figures are shown.

# For each data set there is a separate import function to handle the many differnet formats
loadNETG1G2 = function(){
  meth_dir = "/Bioinformatics/projects/030_NETG1G2_EPIC"
  
  meta_meth = readRDS(file.path(meth_dir, 
                                "meta_data/220610_NETG1G2_EPIC_normalization_metaData.Rds")) %>% 
    filter(!is.na(CPI455_GR_AOC_median_sens)) %>% 
    dplyr::select(Sample_Name,
                  CPI455_GR_AOC_median_sens,
                  DAXX_ATRX,
                  MEN1)
  
  kdm5a = read.xlsx(file.path(meth_dir, 
                              "220719_NETG1G2_EPIC_CNV_details.xlsx"),
                    sheet = 2) %>% 
    dplyr::select(Sample_Name,
                  avg_CNV_KDM5A = average_CNV,
                  bin_1_KDM5A = `bin_chr12-0005`)
  meta_meth = left_join(meta_meth, 
                        kdm5a,
                        by = "Sample_Name")
  
  kdm5b = read.xlsx(file.path(meth_dir, 
                              "220719_NETG1G2_EPIC_CNV_details.xlsx"),
                    sheet = 3) %>% 
    dplyr::select(Sample_Name,
                  avg_CNV_KDM5B = average_CNV,
                  bin_1_KDM5B = `bin_chr1-1207`,
                  bin_2_KDM5B = `bin_chr1-1208`)
  meta_meth = left_join(meta_meth, 
                        kdm5b,
                        by = "Sample_Name")
  
  
  meta_rna = read_delim(file.path(expression_dir,
                                  "NETG1G2_tumoroids",
                                  "NETG1G2_tumoroids_meta_data.csv"),
                        delim = ",",
                        show_col_types = F) %>% 
    dplyr::select(Sample_Name = aPmP_number_ext, 
                  Sample_ID,
                  Sample_Type,
                  CPI455_GR_AOC_median_sens)
  
  rna_counts = read_delim(file.path(expression_dir, 
                                    "NETG1G2_tumoroids",
                                    "NETG1G2_tumoroids_GRCh38_Ensembl104_STAR_TMM.csv"),
                          delim = ",", 
                          show_col_types = F) %>% 
    filter(gene_name %in% c("KDM5A", "KDM5B", "KDM5C", "KDMD")) %>% 
    as.data.frame()
  rownames(rna_counts) = rna_counts$gene_name
  rna_counts = t(rna_counts[, grepl("sapril", colnames(rna_counts))])
  rna_counts = rna_counts %>% 
    as.data.frame() %>% 
    rownames_to_column("Sample_ID")
  
  meta_rna = left_join(meta_rna,
                       rna_counts,
                       by = "Sample_ID")
  
  left_join(meta_meth,
            meta_rna,
            by = "Sample_Name",
            multiple = "all",
            suffix = c("", "_dup")) %>% 
    mutate(Sample_Type = ifelse(is.na(Sample_Type), "tumor_tissue", Sample_Type))
}

NETG1G2 = loadNETG1G2()

3.1 KDM5A

cowplot::plot_grid(
  NETG1G2 %>% 
    filter(Sample_Type != "tumoroid") %>% 
    arrange(!is.na(KDM5A)) %>% 
    ggplot(aes(x = CPI455_GR_AOC_median_sens,
               y = avg_CNV_KDM5A,
               fill = CPI455_GR_AOC_median_sens)) + 
    geom_boxplot(width = 0.5,
                 alpha = 0.75,
                 show.legend = F) +
    geom_point(size = 2,
               show.legend = F) + 
    geom_text_repel(aes(label = Sample_Name,
                        fill = NULL)) + 
    scale_fill_brewer(palette = "Set1") +
    theme_classic() + 
    labs(title = "copy number variation",
         y = "average CNV at locus",
         x = "median CPI-455 sensitivity"),
  NETG1G2 %>% 
    filter(Sample_Type != "tumoroid" & !is.na(KDM5A)) %>% 
    ggplot(aes(x = CPI455_GR_AOC_median_sens,
               y = KDM5A,
               fill = CPI455_GR_AOC_median_sens)) + 
    geom_boxplot(width = 0.5,
                 alpha = 0.75,
                 show.legend = F) +
    geom_point(size = 2,
               show.legend = F) + 
    geom_text_repel(aes(label = Sample_Name,
                        fill = NULL)) + 
    scale_fill_brewer(palette = "Set1") +
    theme_classic() + 
    labs(title = "gene expression",
         y = "log2 TMM gene expression",
         x = "median CPI-455 sensitivity"),
  NETG1G2 %>% 
    filter(!is.na(KDM5A)) %>% 
    ggplot(aes(x = avg_CNV_KDM5A,
               y = KDM5A,
               color = Sample_Name,
               shape = Sample_Type)) + 
    geom_point(size = 3) +
    scale_color_brewer(palette = "Set1") +
    theme_classic() + 
    labs(title = "CNV vs gene expression", 
         x = "average CNV at locus",
         y = "log2 TMM gene expression"),
  ncol = 3,
  rel_widths = c(2,2,3)
)

  • Higher copy number at the KDM5A locus is correlated with lower sensitivity to CPI-455
  • The inverse is observed for gene expression
  • The gene expression levels are inversely related to the average copy number

3.2 KDM5B

cowplot::plot_grid(
  NETG1G2 %>% 
    filter(Sample_Type != "tumoroid") %>% 
    arrange(!is.na(KDM5B)) %>% 
    ggplot(aes(x = CPI455_GR_AOC_median_sens,
               y = avg_CNV_KDM5B,
               fill = CPI455_GR_AOC_median_sens)) + 
    geom_boxplot(width = 0.5,
                 alpha = 0.75,
                 show.legend = F) +
    geom_point(size = 2,
               show.legend = F) + 
    geom_text_repel(aes(label = Sample_Name,
                        fill = NULL)) + 
    scale_fill_brewer(palette = "Set1") +
    theme_classic() + 
    labs(title = "copy number variation",
         y = "average CNV at locus",
         x = "median CPI-455 sensitivity"),
  NETG1G2 %>% 
    filter(Sample_Type != "tumoroid" & !is.na(KDM5B)) %>% 
    ggplot(aes(x = CPI455_GR_AOC_median_sens,
               y = KDM5B,
               fill = CPI455_GR_AOC_median_sens)) + 
    geom_boxplot(width = 0.5,
                 alpha = 0.75,
                 show.legend = F) +
    geom_point(size = 2,
               show.legend = F) + 
    geom_text_repel(aes(label = Sample_Name,
                        fill = NULL)) + 
    scale_fill_brewer(palette = "Set1") +
    theme_classic() + 
    labs(title = "gene expression",
         y = "log2 TMM gene expression",
         x = "median CPI-455 sensitivity"),
  NETG1G2 %>% 
    filter(!is.na(KDM5B)) %>% 
    ggplot(aes(x = avg_CNV_KDM5B,
               y = KDM5B,
               color = Sample_Name,
               shape = Sample_Type)) + 
    geom_point(size = 3) +
    scale_color_brewer(palette = "Set1") +
    theme_classic() + 
    labs(title = "CNV vs gene expression", 
         x = "average CNV at locus",
         y = "log2 TMM gene expression"),
  ncol = 3,
  rel_widths = c(2,2,3)
)

  • CPI-455 sensitivity tends to be lower when the KDM5B locus is amplified
  • The variability of copy number alterations in both groups is high
  • Expression of KDM5B is not predictive of drug sensitivity
  • There is no correlation between KDM5B copy number and gene expression

The CNV plots show an enrichment of low CNVs in the good responders but variability is very high. Additionally, several of the CNV plots show ambiguous signal (i.e. equal numbers of gains and losses make it hard to estimate the true baseline).

3.3 CNV plots

In PanNETs 3p and 11q are frequently lost but never amplified. This can be used to identify samples with problematic CNV baseline.

3.3.1 Low sensitivity

knitr::include_graphics(file.path("/Bioinformatics/projects/030_NETG1G2_EPIC",
                                  "processed_data/CNA_plots/Illumina_normalized/full_genome",
                                  paste0(c("aP487p", "aP321m", "aP476m", "mP041p"), 
                                         "_CNA.png")))

  • No sample shows amplification of 3p or 11q

3.3.2 High sensitivity

knitr::include_graphics(file.path("/Bioinformatics/projects/030_NETG1G2_EPIC",
                                  "processed_data/CNA_plots/Illumina_normalized/full_genome",
                                  paste0(c("mP078p", "mP058p", "mP044p", "mP072p", "mP029p"), 
                                         "_CNA.png")))

  • No sample shows amplification of 3p or 11q
  • The samples with good response overall have fewer CNVs

3.4 Summary NETG1G2

There is a correlation between CNVs and response to KDM5 inhibition via CPI-455. At the KDM5A locus low response is enriched in amplifications. Over the whole genome low response is liked to a higher number of aberrant chromosomes.

Based on the experience with the samples from Di Domenico 2020 it is expected that baseline correction in most cases results in a reduction of CNV values. It is hard to predict how this might influence the split across CPI-455 sensitivity.

The link between gene expression and CNV is low. This may be a hint that CNV estimates are not sufficiently precise without baseline correction.

4 Cell lines

The gene expression was quantified using very different approaches (QGP1 / BON1: HISAT, NT3 / NT18: STAR). Therefore the expression values are not directly comparable and the data sets are not visualized together.

# The cell lines are loaded together because the CNV information is joined
loadQBNT = function(){
  meth_dir = "/Bioinformatics/projects/044_Cell_lines_hypoxia"
  
  meta_meth = readRDS(file.path(meth_dir, 
                                "meta_data/221212_CEL_metaData.Rds")) %>% 
    dplyr::select(Sample_Name,
                  Cell_Line,
                  Treatment) 
  
  # If the gene CNV information is missing generate it or load from file
  if (!file.exists("processed_data/230614_cell_line_CNV_details.xlsx")){
    anno = createAnno("overlap")
    
    CNV_data = readRDS(file.path(meth_dir, 
                                 "processed_data/221212_CNV_conumee.Rds"))
    CNV_bins = extractCNABins(CNV_data$ilmn)
    
    gene_details = setNames(lapply(detail_regions$name, 
                                   getCNVvalues, 
                                   CNV_bins = CNV_bins,
                                   anno = anno),
                            nm = detail_regions$name)
    
    write.xlsx(gene_details,
               "processed_data/230614_cell_line_CNV_details.xlsx")
    
    rm(CNV_bins)
    gc()
  } 
  kdm5a = read.xlsx("processed_data/230614_cell_line_CNV_details.xlsx",
                    sheet = 2) %>% 
    dplyr::select(Sample_Name,
                  avg_CNV_KDM5A = average_CNV,
                  bin_1_KDM5A = `bin_chr12-0005`)
  
  kdm5b = read.xlsx("processed_data/230614_cell_line_CNV_details.xlsx",
                    sheet = 3) %>% 
    dplyr::select(Sample_Name,
                  avg_CNV_KDM5B = average_CNV,
                  bin_1_KDM5B = `bin_chr1-1207`,
                  bin_2_KDM5B = `bin_chr1-1208`)
  
  meta_meth = left_join(meta_meth, 
                        kdm5a,
                        by = "Sample_Name")
  
  meta_meth = left_join(meta_meth, 
                        kdm5b,
                        by = "Sample_Name")
  
  # The NT3 / NT18 samples names need to be matched 
  meta_meth = meta_meth %>% 
    mutate(Sample_Name = ifelse(grepl("^NT", Sample_Name), 
                                gsub("([0-9])_([NH])", "\\1\\2", Sample_Name),
                                Sample_Name))
  
  NT_counts = read_delim(file.path(expression_dir, 
                                   "Cell_lines_hypoxia",
                                   "Cell_lines_hypoxia_GRCh38_Ensembl104_STAR_TMM.csv"),
                         delim = ",", 
                         show_col_types = F) %>% 
    filter(gene_name %in% c("KDM5A", "KDM5B", "KDM5C", "KDM5B")) %>% 
    as.data.frame()
  rownames(NT_counts) = NT_counts$gene_name
  NT_counts = t(NT_counts[, grepl("^NT", colnames(NT_counts))])
  NT_counts = NT_counts %>% 
    as.data.frame() %>% 
    rownames_to_column("Sample_Name")
  
  QB_counts = read_delim(file.path(expression_dir, 
                                   "Cell_lines_DAXX_ATRX",
                                   "Cell_lines_DAXX_ATRX_GRCh38_Ensembl94_HISAT_TMM.csv"),
                         delim = ",", 
                         show_col_types = F) %>% 
    filter(gene_name %in% c("KDM5A", "KDM5B")) %>% 
    as.data.frame()
  rownames(QB_counts) = QB_counts$gene_name
  QB_counts = t(QB_counts[, grepl("^[BQ]", colnames(QB_counts))])
  # The naming of the samples is so messed up that it is easier to fix manually
  QB_counts = QB_counts %>% 
    as.data.frame() %>% 
    mutate(Sample_Name = c("BON1_DAXX_Cl16",
                           "BON1_DAXX_Cl45_C4",
                           "BON1_DAXX_Cl4_C11",
                           "BON1_ATRX_Cl23",
                           "BON1_ATRX_Cl2",
                           "BON1_ATRX_Cl4",
                           "BON1_Par_1",
                           "BON1_Par_2",
                           "BON1_Par_3",
                           "QGP1_ATRX_Cl12",
                           "QGP1_ATRX_Cl18",
                           "QGP1_ATRX_Cl24",
                           "QGP1_Par_1",
                           "QGP1_Par_2",
                           "QGP1_Par_3"))
  
  rna_counts = bind_rows(NT_counts,
                         QB_counts)
  
  left_join(meta_meth,
            rna_counts,
            by = "Sample_Name",
            suffix = c("", "_dup")) 
}

QBNT = loadQBNT()

4.1 QGP1 BON1 KDM5A

cowplot::plot_grid(
  QBNT %>% 
    filter(grepl("^[QB]", Sample_Name)) %>% 
    ggplot(aes(x = avg_CNV_KDM5A,
               y = KDM5A,
               color = Cell_Line,
               shape = Treatment)) + 
    geom_point(size = 3) +
    scale_color_brewer(palette = "Set1") +
    theme_classic() + 
    labs(title = "CNV vs gene expression", 
         x = "average CNV at locus",
         y = "log2 TMM gene expression"),
  QBNT %>% 
    filter(grepl("^[QB]", Sample_Name)) %>% 
    ggplot(aes(x = avg_CNV_KDM5A,
               y = KDM5A,
               color = Treatment,
               shape = Cell_Line)) + 
    geom_point(size = 3) +
    scale_color_brewer(palette = "Set1") +
    theme_classic() + 
    labs(title = "CNV vs gene expression", 
         x = "average CNV at locus",
         y = "log2 TMM gene expression")
)

  • With one exception all replicates cluster together
  • Compared to the NETG1G2 samples CNV is rather low
  • There is a weak negative correlation between copy number and gene expression

4.2 QGP1 BON1 KDM5B

cowplot::plot_grid(
  QBNT %>% 
    filter(grepl("^[QB]", Sample_Name)) %>% 
    ggplot(aes(x = avg_CNV_KDM5B,
               y = KDM5B,
               color = Cell_Line,
               shape = Treatment)) + 
    geom_point(size = 3) +
    scale_color_brewer(palette = "Set1") +
    theme_classic() + 
    labs(title = "CNV vs gene expression", 
         x = "average CNV at locus",
         y = "log2 TMM gene expression"),
  QBNT %>% 
    filter(grepl("^[QB]", Sample_Name)) %>% 
    ggplot(aes(x = avg_CNV_KDM5B,
               y = KDM5B,
               color = Treatment,
               shape = Cell_Line)) + 
    geom_point(size = 3) +
    scale_color_brewer(palette = "Set1") +
    theme_classic() + 
    labs(title = "CNV vs gene expression", 
         x = "average CNV at locus",
         y = "log2 TMM gene expression")
)

  • Replicates cluster together (variability is relatively high)
  • Almost all samples show a copy number loss at this locus
  • There is a no correlation between copy number and gene expression

4.3 NT3 NT18 KDM5A

cowplot::plot_grid(
  QBNT %>% 
    filter(grepl("^NT", Sample_Name)) %>% 
    ggplot(aes(x = avg_CNV_KDM5A,
               y = KDM5A,
               color = Cell_Line,
               shape = Treatment)) + 
    geom_point(size = 3) +
    scale_color_brewer(palette = "Set1") +
    theme_classic() + 
    labs(title = "CNV vs gene expression", 
         x = "average CNV at locus",
         y = "log2 TMM gene expression"),
  QBNT %>% 
    filter(grepl("^NT", Sample_Name)) %>% 
    ggplot(aes(x = avg_CNV_KDM5A,
               y = KDM5A,
               color = Treatment,
               shape = Cell_Line)) + 
    geom_point(size = 3) +
    scale_color_brewer(palette = "Set1") +
    theme_classic() + 
    labs(title = "CNV vs gene expression", 
         x = "average CNV at locus",
         y = "log2 TMM gene expression")
)

  • Samples most strongly cluster by cell line and then treatment
  • Compared to the NETG1G2 samples CNV is rather low
  • There is a weak positive correlation between copy number and gene expression (in particular when comparing NT3 and NT18LM)

4.4 NT3 NT18 KDM5B

cowplot::plot_grid(
  QBNT %>% 
    filter(grepl("^NT", Sample_Name)) %>% 
    ggplot(aes(x = avg_CNV_KDM5B,
               y = KDM5B,
               color = Cell_Line,
               shape = Treatment)) + 
    geom_point(size = 3) +
    scale_color_brewer(palette = "Set1") +
    theme_classic() + 
    labs(title = "CNV vs gene expression", 
         x = "average CNV at locus",
         y = "log2 TMM gene expression"),
  QBNT %>% 
    filter(grepl("NT", Sample_Name)) %>% 
    ggplot(aes(x = avg_CNV_KDM5B,
               y = KDM5B,
               color = Treatment,
               shape = Cell_Line)) + 
    geom_point(size = 3) +
    scale_color_brewer(palette = "Set1") +
    theme_classic() + 
    labs(title = "CNV vs gene expression", 
         x = "average CNV at locus",
         y = "log2 TMM gene expression")
)

  • Samples most strongly cluster by cell line
  • The treatment effect is very weak
  • CNVs are very small at this locus
  • There is a weak negative correlation between copy number and gene expression (in particular when comparing NT3 and NT18P)

The cell lines confirm that the link between copy number and gene expression of KDM5A / KDM5B is quite weak. The correlation varies between tumors and the different cell lines making it hard to predict gene expression from CNV.

4.5 Summary cell lines

All cell lines have low CNV and based on this are expected to respond well to CPI-455 inhibition. Due to experimental differences it is not trivial to compare the gene expression values to predict drug response. Based on relative gene expression QGP1 are expected to be slightly more sensitive than BON1 and NT18LM more sensitive compared to NT3.

For further analysis of the cell lines more meta data such as the DAXX / ATRX and MEN1 mutation status and ideally a measure of CPI-455 response will be beneficial.

5 Di Domenico 2020

The expression data is analyzed using the same toolchain (STAR + RSEM) and may be compared directly.

CNV data is available for 81 samples. Gene expression data is available for 42 samples. For 19 both data types exist.
As mentioned above CNV values were corrected and are not mixed with uncorrected estimates for the missing samples.

# Gene expression and methylation data are only available for some of the samples 
# The union of samples is loaded 
loadNDD = function(){
  meta_meth = read_delim(file.path("/Bioinformatics/projects/shared_metadata/PanNET_methylation", 
                                   "221004_PanNET_methylation_annotation_DiDomenico_full.txt"),
                         delim = "\t", 
                         show_col_types = F) %>% 
    dplyr::select(Sample_Name,
                  MEN1,
                  DAXX_ATRX,
                  CC_Epi_newLRO) 
  
  if (!file.exists("processed_data/230614_NDD_CNV_details.xlsx")){
    # The chromosome arm data from Nunzia was found under 5.NDD\DNAme\Analysis\CNA\CNA_corr_Meth_And_LOH_ICGC_UCL_UB/df_Plot_Bins_AllSamples_noNA
    # This plot data contains the corrected bins for most samples (UB06 is missing for some unspecified reason)
    # There likely is a reason for this in the previous analysis steps
    # Some bins were removed on accident if the End is the same as the chromosome arm border 
    # This should not be problematic due to the low number and location close to the centrosomes
    # The legend of the heatmap is misleading because the cutoffs are not -1 and 1 but rather -0.2 and 0.2
    load(file.path("/Bioinformatics/projects/045_Aurel_one_PanNET",
                   "processed_data/df_Plot_Bins_NoNA_AllSamples.RDa"))
    
    anno = createAnno("450k")
    
    bin_data = anno@bins %>%
      as.data.frame() %>% 
      rownames_to_column("Feature") %>% 
      mutate(BinName = paste0(seqnames, "_", start, "_", end))
    
    CNV_bins = left_join(df_Plot_Bins_AllSamples_noNA,
                         bin_data %>% 
                           dplyr::select(Feature, 
                                         BinName),
                         by = "BinName")  %>% 
      dplyr::select(-c(BinName, 
                       Chromosome,
                       ChrArm.x)) %>% 
      pivot_longer(names_to = "Sample_Name",
                   values_to = "CNV",
                   -c(Feature, 
                      Start,
                      End))
    
    gene_details = setNames(lapply(detail_regions$name, 
                                   getCNVvalues, 
                                   CNV_bins = CNV_bins,
                                   anno = anno),
                            nm = detail_regions$name)
    
    write.xlsx(gene_details,
               "processed_data/230614_NDD_CNV_details.xlsx")
  } 
  
  kdm5a = read.xlsx("processed_data/230614_NDD_CNV_details.xlsx",
                    sheet = 2) %>% 
    dplyr::select(Sample_Name,
                  avg_CNV_KDM5A = average_CNV,
                  bin_1_KDM5A = `bin_chr12-0005`)
  
  kdm5b = read.xlsx("processed_data/230614_NDD_CNV_details.xlsx",
                    sheet = 3) %>% 
    dplyr::select(Sample_Name,
                  avg_CNV_KDM5B = average_CNV,
                  bin_1_KDM5B = `bin_chr1-1269`,
                  bin_2_KDM5B = `bin_chr1-1270`)
  
  meta_meth = left_join(meta_meth, 
                        kdm5a,
                        by = "Sample_Name")
  
  meta_meth = left_join(meta_meth, 
                        kdm5b,
                        by = "Sample_Name")
  
  # all samples matched?
  
  Scarpa_counts = read_delim(file.path(expression_dir, 
                                       "Scarpa_2017",
                                       "Scarpa_2017_GRCh37_Ensembl75_RSEM_TMM.csv"),
                             delim = ",", 
                             show_col_types = F) %>% 
    filter(gene_name %in% c("KDM5A", "KDM5B", "KDM5C", "KDMD")) %>% 
    as.data.frame()
  rownames(Scarpa_counts) = Scarpa_counts$gene_name
  Scarpa_counts = t(Scarpa_counts[, grepl("^ICGC", colnames(Scarpa_counts))])
  Scarpa_counts = Scarpa_counts %>% 
    as.data.frame() %>% 
    rownames_to_column("Sample_Name")
  
  Chan_counts = read_delim(file.path(expression_dir, 
                                     "Chan_2018",
                                     "Chan_2018_GRCh37_Ensembl75_RSEM_TMM.csv"),
                           delim = ",", 
                           show_col_types = F) %>% 
    filter(gene_name %in% c("KDM5A", "KDM5B", "KDM5C", "KDMD")) %>% 
    as.data.frame()
  rownames(Chan_counts) = Chan_counts$gene_name
  Chan_counts = t(Chan_counts[, grepl("[0-9]$", colnames(Chan_counts))])
  # The naming of the samples is so messed up that it is easier to fix manually
  Chan_counts = Chan_counts %>% 
    as.data.frame() %>% 
    rownames_to_column("Sample_Name")
  
  rna_counts = bind_rows(Scarpa_counts,
                         Chan_counts)
  
  left_join(meta_meth,
            rna_counts,
            by = "Sample_Name",
            suffix = c("", "_dup")) 
}

NDD = loadNDD()

5.1 CNV vs gene expression KDM5A

cowplot::plot_grid(
  NDD %>% 
    filter(!is.na(NDD$KDM5A) & !is.na(NDD$avg_CNV_KDM5A)) %>% 
    mutate(Cohort= ifelse(grepl("ICGC", Sample_Name), "Scarpa", "Chan")) %>% 
    ggplot(aes(x = avg_CNV_KDM5A,
               y = KDM5A,
               color = Cohort)) + 
    geom_point(size = 3) +
    scale_color_brewer(palette = "Set1") +
    theme_classic() + 
    labs(title = "CNV vs gene expression", 
         x = "average CNV at locus",
         y = "log2 TMM gene expression"),
  NDD %>% 
    filter(!is.na(NDD$KDM5A) & !is.na(NDD$avg_CNV_KDM5A)) %>% 
    ggplot(aes(x = avg_CNV_KDM5A,
               y = KDM5A,
               color = CC_Epi_newLRO)) + 
    geom_point(size = 3) +
    scale_color_brewer(palette = "Set1") +
    theme_classic() + 
    labs(title = "CNV vs gene expression", 
         x = "average CNV at locus",
         y = "log2 TMM gene expression"),
  NDD %>% 
    filter(!is.na(NDD$KDM5A) & !is.na(NDD$avg_CNV_KDM5A)) %>% 
    mutate(Cohort= ifelse(grepl("ICGC", Sample_Name), "Scarpa", "Chan")) %>% 
    ggplot(aes(x = avg_CNV_KDM5A,
               y = KDM5A,
               color = DAXX_ATRX)) + 
    geom_point(size = 3) +
    scale_color_brewer(palette = "Set1") +
    theme_classic() + 
    labs(title = "CNV vs gene expression", 
         x = "average CNV at locus",
         y = "log2 TMM gene expression"),
  NDD %>% 
    filter(!is.na(NDD$KDM5A) & !is.na(NDD$avg_CNV_KDM5A)) %>% 
    ggplot(aes(x = avg_CNV_KDM5A,
               y = KDM5A,
               color = MEN1)) + 
    geom_point(size = 3) +
    scale_color_brewer(palette = "Set1") +
    theme_classic() + 
    labs(title = "CNV vs gene expression", 
         x = "average CNV at locus",
         y = "log2 TMM gene expression"),
  align = "v"
)

  • The samples partially cluster by epigenetic groups but Intermediate_ADM shows very different KDM5 expression
  • CNV values are lower than in the NETG1G2 cohort and often negative
  • Losses at the KDM5A locus are correlated with lower expression values
  • There is a tendency towards higher KDM5A expression in DAXX / ATRX mutant samples

5.2 CNV vs gene expression KDM5B

cowplot::plot_grid(
  NDD %>% 
    filter(!is.na(NDD$KDM5B) & !is.na(NDD$avg_CNV_KDM5B)) %>% 
    mutate(Cohort= ifelse(grepl("ICGC", Sample_Name), "Scarpa", "Chan")) %>% 
    ggplot(aes(x = avg_CNV_KDM5B,
               y = KDM5B,
               color = Cohort)) + 
    geom_point(size = 3) +
    scale_color_brewer(palette = "Set1") +
    theme_classic() + 
    labs(title = "CNV vs gene expression", 
         x = "average CNV at locus",
         y = "log2 TMM gene expression"),
  NDD %>% 
    filter(!is.na(NDD$KDM5B) & !is.na(NDD$avg_CNV_KDM5B)) %>% 
    ggplot(aes(x = avg_CNV_KDM5B,
               y = KDM5B,
               color = CC_Epi_newLRO)) + 
    geom_point(size = 3) +
    scale_color_brewer(palette = "Set1") +
    theme_classic() + 
    labs(title = "CNV vs gene expression", 
         x = "average CNV at locus",
         y = "log2 TMM gene expression"),
  NDD %>% 
    filter(!is.na(NDD$KDM5B) & !is.na(NDD$avg_CNV_KDM5B)) %>% 
    mutate(Cohort= ifelse(grepl("ICGC", Sample_Name), "Scarpa", "Chan")) %>% 
    ggplot(aes(x = avg_CNV_KDM5B,
               y = KDM5B,
               color = DAXX_ATRX)) + 
    geom_point(size = 3) +
    scale_color_brewer(palette = "Set1") +
    theme_classic() + 
    labs(title = "CNV vs gene expression", 
         x = "average CNV at locus",
         y = "log2 TMM gene expression"),
  NDD %>% 
    filter(!is.na(NDD$KDM5B) & !is.na(NDD$avg_CNV_KDM5B)) %>% 
    ggplot(aes(x = avg_CNV_KDM5B,
               y = KDM5B,
               color = MEN1)) + 
    geom_point(size = 3) +
    scale_color_brewer(palette = "Set1") +
    theme_classic() + 
    labs(title = "CNV vs gene expression", 
         x = "average CNV at locus",
         y = "log2 TMM gene expression"),
  align = "v"
)

  • The samples partially cluster by epigenetic groups but Intermediate_ADM shows very different KDM5 expression
  • CNV values are lower than in the NETG1G2 cohort and often negative
  • There is no correlation between CNV and gene expression
  • Neither DAXX / ATRX nor MEN1 mutation status is clearly correlated with KDM5B expression

All samples have low CNV and based on this are expected to respond well to CPI-455 inhibition. When predicting CPI-455 response from gene expression two clusters with different expected sensitivity can be constructed.

5.3 All KDM5A CNVs

cowplot::plot_grid(
  NDD %>% 
    filter(!is.na(NDD$avg_CNV_KDM5A)) %>% 
    mutate(Cohort= ifelse(grepl("ICGC", Sample_Name), "Scarpa", "Chan"),
           Sample_Name = factor(Sample_Name, 
                                levels = Sample_Name[order(avg_CNV_KDM5A)])) %>% 
    ggplot(aes(x = Sample_Name,
               y = avg_CNV_KDM5A,
               color = Cohort)) + 
    geom_point(size = 2) +
    scale_color_brewer(palette = "Set1") +
    theme_classic() + 
    theme(axis.title.x = element_blank(),
          axis.text.x = element_text(angle = 45, hjust = 1, vjust = 1, size = 5)) + 
    labs(title = "CNV variability", 
         y = "average CNV at locus"),
  NDD %>% 
    filter(!is.na(NDD$avg_CNV_KDM5A)) %>% 
    mutate(Cohort= ifelse(grepl("ICGC", Sample_Name), "Scarpa", "Chan"),
           Sample_Name = factor(Sample_Name, 
                                levels = Sample_Name[order(avg_CNV_KDM5A)])) %>% 
    ggplot(aes(x = Sample_Name,
               y = avg_CNV_KDM5A,
               color = CC_Epi_newLRO)) + 
    geom_point(size = 2) +
    scale_color_brewer(palette = "Set1") +
    theme_classic() + 
    theme(axis.title.x = element_blank(),
          axis.text.x = element_text(angle = 45, hjust = 1, vjust = 1, size = 5)) + 
    labs(title = "CNV variability", 
         y = "average CNV at locus"),
  NDD %>% 
    filter(!is.na(NDD$avg_CNV_KDM5A)) %>% 
    mutate(Cohort= ifelse(grepl("ICGC", Sample_Name), "Scarpa", "Chan"),
           Sample_Name = factor(Sample_Name, 
                                levels = Sample_Name[order(avg_CNV_KDM5A)])) %>% 
    ggplot(aes(x = Sample_Name,
               y = avg_CNV_KDM5A,
               color = DAXX_ATRX)) + 
    geom_point(size = 2) +
    scale_color_brewer(palette = "Set1", na.value = "grey50") +
    theme_classic() + 
    theme(axis.title.x = element_blank(),
          axis.text.x = element_text(angle = 45, hjust = 1, vjust = 1, size = 5)) + 
    labs(title = "CNV variability", 
         y = "average CNV at locus"),
  NDD %>% 
    filter(!is.na(NDD$avg_CNV_KDM5A)) %>% 
    mutate(Cohort= ifelse(grepl("ICGC", Sample_Name), "Scarpa", "Chan"),
           Sample_Name = factor(Sample_Name, 
                                levels = Sample_Name[order(avg_CNV_KDM5A)])) %>% 
    ggplot(aes(x = Sample_Name,
               y = avg_CNV_KDM5A,
               color = MEN1)) + 
    geom_point(size = 2) +
    scale_color_brewer(palette = "Set1") +
    theme_classic() + 
    theme(axis.title.x = element_blank(),
          axis.text.x = element_text(angle = 45, hjust = 1, vjust = 1, size = 5)) + 
    labs(title = "CNV variability", 
         y = "average CNV at locus"),
  ncol = 1, 
  align = "v"
)

  • CNVs are generally lower in the Scarpa cohort
  • Higher CNV samples are enriched in the Intermediate_ADM epigenetic group
  • The few Intermediate_ADM samples with low CNV are all from the Scarpa group
  • DAXX / ATRX mutation is associated with amplifications at the KDM5A locus

The separation of the cohorts may indicate a technical effect driving CNV estimates

5.4 All KDM5B CNVs

cowplot::plot_grid(
  NDD %>% 
    filter(!is.na(NDD$avg_CNV_KDM5B)) %>% 
    mutate(Cohort= ifelse(grepl("ICGC", Sample_Name), "Scarpa", "Chan"),
           Sample_Name = factor(Sample_Name, 
                                levels = Sample_Name[order(avg_CNV_KDM5B)])) %>% 
    ggplot(aes(x = Sample_Name,
               y = avg_CNV_KDM5B,
               color = Cohort)) + 
    geom_point(size = 2) +
    scale_color_brewer(palette = "Set1") +
    theme_classic() + 
    theme(axis.title.x = element_blank(),
          axis.text.x = element_text(angle = 45, hjust = 1, vjust = 1, size = 5)) + 
    labs(title = "CNV variability", 
         y = "average CNV at locus"),
  NDD %>% 
    filter(!is.na(NDD$avg_CNV_KDM5B)) %>% 
    mutate(Cohort= ifelse(grepl("ICGC", Sample_Name), "Scarpa", "Chan"),
           Sample_Name = factor(Sample_Name, 
                                levels = Sample_Name[order(avg_CNV_KDM5B)])) %>% 
    ggplot(aes(x = Sample_Name,
               y = avg_CNV_KDM5B,
               color = CC_Epi_newLRO)) + 
    geom_point(size = 2) +
    scale_color_brewer(palette = "Set1") +
    theme_classic() + 
    theme(axis.title.x = element_blank(),
          axis.text.x = element_text(angle = 45, hjust = 1, vjust = 1, size = 5)) + 
    labs(title = "CNV variability", 
         y = "average CNV at locus"),
  NDD %>% 
    filter(!is.na(NDD$avg_CNV_KDM5B)) %>% 
    mutate(Cohort= ifelse(grepl("ICGC", Sample_Name), "Scarpa", "Chan"),
           Sample_Name = factor(Sample_Name, 
                                levels = Sample_Name[order(avg_CNV_KDM5B)])) %>% 
    ggplot(aes(x = Sample_Name,
               y = avg_CNV_KDM5B,
               color = DAXX_ATRX)) + 
    geom_point(size = 2) +
    scale_color_brewer(palette = "Set1", na.value = "grey50") +
    theme_classic() + 
    theme(axis.title.x = element_blank(),
          axis.text.x = element_text(angle = 45, hjust = 1, vjust = 1, size = 5)) + 
    labs(title = "CNV variability", 
         y = "average CNV at locus"),
  NDD %>% 
    filter(!is.na(NDD$avg_CNV_KDM5B)) %>% 
    mutate(Cohort= ifelse(grepl("ICGC", Sample_Name), "Scarpa", "Chan"),
           Sample_Name = factor(Sample_Name, 
                                levels = Sample_Name[order(avg_CNV_KDM5B)])) %>% 
    ggplot(aes(x = Sample_Name,
               y = avg_CNV_KDM5B,
               color = MEN1)) + 
    geom_point(size = 2) +
    scale_color_brewer(palette = "Set1") +
    theme_classic() + 
    theme(axis.title.x = element_blank(),
          axis.text.x = element_text(angle = 45, hjust = 1, vjust = 1, size = 5)) + 
    labs(title = "CNV variability", 
         y = "average CNV at locus"),
  ncol = 1, 
  align = "v"
)

  • CNVs are generally higher in the Scarpa cohort
  • Lower CNV samples are enriched in the Intermediate_ADM epigenetic group
  • DAXX / ATRX mutation is associated with absence of amplifications at the KDM5A locus

KDM5A and KDM5B display an inverted pattern.

5.5 All KDM5A gene expression

cowplot::plot_grid(
  NDD %>% 
    filter(!is.na(NDD$KDM5A)) %>% 
    mutate(Cohort= ifelse(grepl("ICGC", Sample_Name), "Scarpa", "Chan"),
           Sample_Name = factor(Sample_Name, 
                                levels = Sample_Name[order(KDM5A)])) %>% 
    ggplot(aes(x = Sample_Name,
               y = KDM5A,
               color = Cohort)) + 
    geom_point(size = 2) +
    scale_color_brewer(palette = "Set1") +
    theme_classic() + 
    theme(axis.title.x = element_blank(),
          axis.text.x = element_text(angle = 45, hjust = 1, vjust = 1, size = 5)) + 
    labs(title = "gene expression variability", 
         y = "log2 TMM gene expression"),
  NDD %>% 
    filter(!is.na(NDD$KDM5A)) %>% 
    mutate(Cohort= ifelse(grepl("ICGC", Sample_Name), "Scarpa", "Chan"),
           Sample_Name = factor(Sample_Name, 
                                levels = Sample_Name[order(KDM5A)])) %>% 
    ggplot(aes(x = Sample_Name,
               y = KDM5A,
               color = CC_Epi_newLRO)) + 
    geom_point(size = 2) +
    scale_color_brewer(palette = "Set1") +
    theme_classic() + 
    theme(axis.title.x = element_blank(),
          axis.text.x = element_text(angle = 45, hjust = 1, vjust = 1, size = 5)) + 
    labs(title = "gene expression variability", 
         y = "log2 TMM gene expression"),
  NDD %>% 
    filter(!is.na(NDD$KDM5A)) %>% 
    mutate(Cohort= ifelse(grepl("ICGC", Sample_Name), "Scarpa", "Chan"),
           Sample_Name = factor(Sample_Name, 
                                levels = Sample_Name[order(KDM5A)])) %>% 
    ggplot(aes(x = Sample_Name,
               y = KDM5A,
               color = DAXX_ATRX)) + 
    geom_point(size = 2) +
    scale_color_brewer(palette = "Set1", na.value = "grey50") +
    theme_classic() + 
    theme(axis.title.x = element_blank(),
          axis.text.x = element_text(angle = 45, hjust = 1, vjust = 1, size = 5)) + 
    labs(title = "gene expression variability", 
         y = "log2 TMM gene expression"),
  NDD %>% 
    filter(!is.na(NDD$KDM5A)) %>% 
    mutate(Cohort= ifelse(grepl("ICGC", Sample_Name), "Scarpa", "Chan"),
           Sample_Name = factor(Sample_Name, 
                                levels = Sample_Name[order(KDM5A)])) %>% 
    ggplot(aes(x = Sample_Name,
               y = KDM5A,
               color = MEN1)) + 
    geom_point(size = 2) +
    scale_color_brewer(palette = "Set1") +
    theme_classic() + 
    theme(axis.title.x = element_blank(),
          axis.text.x = element_text(angle = 45, hjust = 1, vjust = 1, size = 5)) + 
    labs(title = "gene expression variability", 
         y = "log2 TMM gene expression"),
  ncol = 1, 
  align = "v"
)

  • KDM5A expression is generally higher in the Scarpa cohort
  • There is no obvious correlation between KDM5 expression and the epigenetic groups
  • There is no obvious correlation between KDM5 expression and DAXX / ATRX or MEN1 mutation status

5.6 All KDM5B gene expression

cowplot::plot_grid(
  NDD %>% 
    filter(!is.na(NDD$KDM5B)) %>% 
    mutate(Cohort= ifelse(grepl("ICGC", Sample_Name), "Scarpa", "Chan"),
           Sample_Name = factor(Sample_Name, 
                                levels = Sample_Name[order(KDM5B)])) %>% 
    ggplot(aes(x = Sample_Name,
               y = KDM5B,
               color = Cohort)) + 
    geom_point(size = 2) +
    scale_color_brewer(palette = "Set1") +
    theme_classic() + 
    theme(axis.title.x = element_blank(),
          axis.text.x = element_text(angle = 45, hjust = 1, vjust = 1, size = 5)) + 
    labs(title = "gene expression variability", 
         y = "log2 TMM gene expression"),
  NDD %>% 
    filter(!is.na(NDD$KDM5B)) %>% 
    mutate(Cohort= ifelse(grepl("ICGC", Sample_Name), "Scarpa", "Chan"),
           Sample_Name = factor(Sample_Name, 
                                levels = Sample_Name[order(KDM5B)])) %>% 
    ggplot(aes(x = Sample_Name,
               y = KDM5B,
               color = CC_Epi_newLRO)) + 
    geom_point(size = 2) +
    scale_color_brewer(palette = "Set1") +
    theme_classic() + 
    theme(axis.title.x = element_blank(),
          axis.text.x = element_text(angle = 45, hjust = 1, vjust = 1, size = 5)) + 
    labs(title = "gene expression variability", 
         y = "log2 TMM gene expression"),
  NDD %>% 
    filter(!is.na(NDD$KDM5B)) %>% 
    mutate(Cohort= ifelse(grepl("ICGC", Sample_Name), "Scarpa", "Chan"),
           Sample_Name = factor(Sample_Name, 
                                levels = Sample_Name[order(KDM5B)])) %>% 
    ggplot(aes(x = Sample_Name,
               y = KDM5B,
               color = DAXX_ATRX)) + 
    geom_point(size = 2) +
    scale_color_brewer(palette = "Set1", na.value = "grey50") +
    theme_classic() + 
    theme(axis.title.x = element_blank(),
          axis.text.x = element_text(angle = 45, hjust = 1, vjust = 1, size = 5)) + 
    labs(title = "gene expression variability", 
         y = "log2 TMM gene expression"),
  NDD %>% 
    filter(!is.na(NDD$KDM5B)) %>% 
    mutate(Cohort= ifelse(grepl("ICGC", Sample_Name), "Scarpa", "Chan"),
           Sample_Name = factor(Sample_Name, 
                                levels = Sample_Name[order(KDM5B)])) %>% 
    ggplot(aes(x = Sample_Name,
               y = KDM5B,
               color = MEN1)) + 
    geom_point(size = 2) +
    scale_color_brewer(palette = "Set1") +
    theme_classic() + 
    theme(axis.title.x = element_blank(),
          axis.text.x = element_text(angle = 45, hjust = 1, vjust = 1, size = 5)) + 
    labs(title = "gene expression variability", 
         y = "log2 TMM gene expression"),
  ncol = 1, 
  align = "v"
)

  • KDM5B expression is generally higher in the Scarpa cohort
  • There is no obvious correlation between KDM5 expression and the epigenetic groups
  • There is a tendency towards higher KDM5B expression in DAXX / ATRX wt samples; The trend for MEN1 may be inverted

5.7 Summary Di Domenico

In contrast to the NETG1G2 samples all CNVs are rather low. The FISH based correction always resulted in an decrease of the CNVs calculated from DNA methylation data. Therefore the NETG1G2 values may be overestimates.

Due to the high spread in CNV and especially KDM5A / KDM5B expression it would be expected to observe difference in response to CPI-455 in these cases.

Many patterns are strongly correlated with the difference between the Scarpa and Chan data.

sessionInfo()
## R version 4.2.2 (2022-10-31)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 22.04.2 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8       
##  [4] LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C              
## [10] LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] conumee_1.32.0                                     
##  [2] IlluminaHumanMethylationEPICmanifest_0.3.0         
##  [3] IlluminaHumanMethylationEPICanno.ilm10b2.hg19_0.6.0
##  [4] IlluminaHumanMethylation450kmanifest_0.4.0         
##  [5] IlluminaHumanMethylation450kanno.ilmn12.hg19_0.6.1 
##  [6] minfi_1.44.0                                       
##  [7] bumphunter_1.40.0                                  
##  [8] locfit_1.5-9.7                                     
##  [9] iterators_1.0.14                                   
## [10] foreach_1.5.2                                      
## [11] Biostrings_2.66.0                                  
## [12] XVector_0.38.0                                     
## [13] SummarizedExperiment_1.28.0                        
## [14] Biobase_2.58.0                                     
## [15] MatrixGenerics_1.10.0                              
## [16] matrixStats_0.63.0                                 
## [17] GenomicRanges_1.50.2                               
## [18] GenomeInfoDb_1.34.9                                
## [19] IRanges_2.32.0                                     
## [20] S4Vectors_0.36.2                                   
## [21] BiocGenerics_0.44.0                                
## [22] ggrepel_0.9.3                                      
## [23] openxlsx_4.2.5.2                                   
## [24] lubridate_1.9.2                                    
## [25] forcats_1.0.0                                      
## [26] stringr_1.5.0                                      
## [27] dplyr_1.1.0                                        
## [28] purrr_1.0.1                                        
## [29] readr_2.1.4                                        
## [30] tidyr_1.3.0                                        
## [31] tibble_3.2.0                                       
## [32] ggplot2_3.4.1                                      
## [33] tidyverse_2.0.0                                    
## 
## loaded via a namespace (and not attached):
##   [1] BiocFileCache_2.6.1       plyr_1.8.8                splines_4.2.2            
##   [4] BiocParallel_1.32.6       digest_0.6.31             htmltools_0.5.4          
##   [7] fansi_1.0.4               magrittr_2.0.3            memoise_2.0.1            
##  [10] tzdb_0.3.0                limma_3.54.2              annotate_1.76.0          
##  [13] vroom_1.6.1               askpass_1.1               timechange_0.2.0         
##  [16] siggenes_1.72.0           prettyunits_1.1.1         colorspace_2.1-0         
##  [19] blob_1.2.3                rappdirs_0.3.3            xfun_0.37                
##  [22] jsonlite_1.8.4            crayon_1.5.2              RCurl_1.98-1.10          
##  [25] genefilter_1.80.3         GEOquery_2.66.0           survival_3.4-0           
##  [28] glue_1.6.2                gtable_0.3.1              zlibbioc_1.44.0          
##  [31] DelayedArray_0.24.0       Rhdf5lib_1.20.0           HDF5Array_1.26.0         
##  [34] scales_1.2.1              DBI_1.1.3                 rngtools_1.5.2           
##  [37] Rcpp_1.0.10               xtable_1.8-4              progress_1.2.2           
##  [40] bit_4.0.5                 mclust_6.0.0              preprocessCore_1.60.2    
##  [43] httr_1.4.5                RColorBrewer_1.1-3        ellipsis_0.3.2           
##  [46] farver_2.1.1              pkgconfig_2.0.3           reshape_0.8.9            
##  [49] XML_3.99-0.13             sass_0.4.5                dbplyr_2.3.1             
##  [52] DNAcopy_1.72.3            utf8_1.2.3                labeling_0.4.2           
##  [55] tidyselect_1.2.0          rlang_1.0.6               AnnotationDbi_1.60.2     
##  [58] munsell_0.5.0             tools_4.2.2               cachem_1.0.7             
##  [61] cli_3.6.0                 generics_0.1.3            RSQLite_2.3.0            
##  [64] evaluate_0.20             fastmap_1.1.1             yaml_2.3.7               
##  [67] knitr_1.42                bit64_4.0.5               zip_2.2.2                
##  [70] beanplot_1.3.1            scrime_1.3.5              KEGGREST_1.38.0          
##  [73] nlme_3.1-160              doRNG_1.8.6               sparseMatrixStats_1.10.0 
##  [76] nor1mix_1.3-0             xml2_1.3.3                biomaRt_2.54.1           
##  [79] compiler_4.2.2            rstudioapi_0.14           filelock_1.0.2           
##  [82] curl_5.0.0                png_0.1-8                 bslib_0.4.2              
##  [85] stringi_1.7.12            highr_0.10                GenomicFeatures_1.50.4   
##  [88] lattice_0.20-45           Matrix_1.5-1              multtest_2.54.0          
##  [91] vctrs_0.5.2               pillar_1.8.1              lifecycle_1.0.3          
##  [94] rhdf5filters_1.10.1       jquerylib_0.1.4           data.table_1.14.8        
##  [97] cowplot_1.1.1             bitops_1.0-7              rtracklayer_1.58.0       
## [100] R6_2.5.1                  BiocIO_1.8.0              codetools_0.2-18         
## [103] MASS_7.3-58.1             rhdf5_2.42.1              openssl_2.0.6            
## [106] rjson_0.2.21              withr_2.5.0               GenomicAlignments_1.34.1 
## [109] Rsamtools_2.14.0          GenomeInfoDbData_1.2.9    hms_1.1.2                
## [112] quadprog_1.5-8            grid_4.2.2                base64_2.0.1             
## [115] rmarkdown_2.20            DelayedMatrixStats_1.20.0 illuminaio_0.40.0        
## [118] restfulr_0.0.15